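Underwater imaging is a critical task performed by marine robots for a wide range of applications, including aquaculture, marine infrastructure inspection, and environmental monitoring. However, water column effects such as attenuation and backscattering drastically change the color and quality of imagery captured underwater. Because water conditions vary and these effects depend on range, restoring underwater imagery is a challenging problem. This affects downstream perception tasks, including depth estimation and 3D reconstruction. In this paper, we advance the state of the art in neural radiance fields (NeRFs) to enable physics-informed dense depth estimation and color correction. Our proposed method, WaterNeRF, estimates the parameters of a physics-based model of underwater image formation, leading to a hybrid data-driven and model-based solution. After determining the scene structure and radiance field, we can produce novel views of degraded as well as corrected underwater images, along with dense depth of the scene. We evaluate the proposed method qualitatively and quantitatively on a real underwater dataset.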
translated by Google Translate
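Recent advances in neural radiance fields (NeRFs) achieve state-of-the-art novel view synthesis and facilitate dense estimation of scene properties. However, NeRFs often fail for large, unbounded scenes that are captured under very sparse views with the scene content concentrated far from the camera, as is typical in field robotics applications. In particular, NeRF-style algorithms perform poorly: (1) when there are insufficient views with little pose diversity, (2) when scenes contain saturation and shadows, and (3) when finely sampling large unbounded scenes with fine structures becomes computationally intensive. This paper proposes CLONeR, which significantly improves upon NeRF by allowing it to model large outdoor driving scenes that are observed from sparse input sensor views. This is achieved by decoupling occupancy and color learning within the NeRF framework into separate multi-layer perceptrons (MLPs) trained using LiDAR and camera data, respectively. In addition, this paper proposes a novel method to build differentiable 3D occupancy grid maps (OGMs) alongside the NeRF model, and leverages this occupancy grid to improve the sampling of points along a ray for volumetric rendering in metric space. Through extensive quantitative and qualitative experiments on scenes from the KITTI dataset, this paper demonstrates that the proposed method outperforms state-of-the-art NeRF models on both novel view synthesis and dense depth prediction tasks when trained on sparse input data.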
Recent work has shown that fine-tuning large pre-trained language models on a collection of tasks described via instructions, a.k.a. instruction-tuning, improves their zero- and few-shot generalization to unseen tasks. However, there is a limited understanding of the performance trade-offs of different decisions made during the instruction-tuning process. These decisions include the scale and diversity of the instruction-tuning benchmark, different task sampling strategies, fine-tuning with and without demonstrations, training using specialized datasets for reasoning and dialogue, and finally, the fine-tuning objectives themselves. In this paper, we characterize the effect of instruction-tuning decisions on downstream task performance when scaling both model and benchmark sizes. To this end, we create OPT-IML Bench: a large benchmark for Instruction Meta-Learning (IML) of 2000 NLP tasks consolidated into task categories from 8 existing benchmarks, and prepare an evaluation framework to measure three types of model generalization: to tasks from fully held-out categories, to held-out tasks from seen categories, and to held-out instances from seen tasks. Through the lens of this framework, we first present insights about instruction-tuning decisions as applied to OPT-30B and further exploit these insights to train OPT-IML 30B and 175B, which are instruction-tuned versions of OPT. OPT-IML demonstrates all three generalization abilities at both scales on four different evaluation benchmarks with diverse tasks and input formats: PromptSource, FLAN, Super-NaturalInstructions, and UnifiedSKG. Not only does it significantly outperform OPT on all benchmarks, but it is also highly competitive with existing models fine-tuned on each specific benchmark. We release OPT-IML at both scales, together with the OPT-IML Bench evaluation framework.
Legal contracts, such as employment or lease agreements, are important documents because they govern the obligations and entitlements of the various contracting parties. However, these documents are typically long and written in legalese, so many manual hours are spent understanding them. In this paper, we address the task of summarizing legal contracts for each of the contracting parties, to enable faster review and improved understanding. Specifically, we collect a dataset consisting of pairwise importance comparison annotations by legal experts for ~293K sentence pairs from lease agreements. We propose a novel extractive summarization system to automatically produce a summary consisting of the most important obligations, entitlements, and prohibitions in a contract. It consists of two modules: (1) a content categorizer to identify sentences containing each of the categories (i.e., obligation, entitlement, and prohibition) for a party, and (2) an importance ranker to compare the importance among sentences of each category for a party to obtain a ranked list. The final summary is produced by selecting the most important sentences of a category for each of the parties. We demonstrate the effectiveness of our proposed system by comparing it against several text ranking baselines via automatic and human evaluation.
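As a rough illustration of how pairwise importance comparisons can be turned into a ranked list and then a summary, the sketch below simply counts how many comparisons each sentence wins. This win-count rule is a stand-in chosen for illustration, not the ranker described in the abstract; the function names, the example sentences, and the cutoff `k` are all hypothetical.

```python
from collections import defaultdict


def rank_by_pairwise_wins(sentences, comparisons):
    """Rank sentences by the number of pairwise comparisons they win.

    `comparisons` is an iterable of (winner, loser) pairs. Win counting is
    an illustrative assumption, not the paper's importance ranker.
    """
    wins = defaultdict(int)
    for winner, _loser in comparisons:
        wins[winner] += 1
    # sorted() is stable, so tied sentences keep their document order.
    return sorted(sentences, key=lambda s: -wins[s])


def summarize(sentences, comparisons, k):
    """Return the k highest-ranked sentences of one category for one party."""
    return rank_by_pairwise_wins(sentences, comparisons)[:k]


# Hypothetical tenant obligations and expert judgments (winner, loser).
sentences = [
    "pay rent monthly",
    "keep premises clean",
    "notify landlord of damage",
]
comparisons = [
    ("pay rent monthly", "keep premises clean"),
    ("pay rent monthly", "notify landlord of damage"),
    ("notify landlord of damage", "keep premises clean"),
]
top2 = summarize(sentences, comparisons, k=2)
```

In the system the abstract describes, this step would run once per category and per party, after the content categorizer has assigned sentences to categories.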
Project Loon is a Google-initiated research project from the Google X Lab. The project focuses on providing remote internet access and network connectivity. The connectivity is established in vertical and horizontal space: vertical connectivity links the Google Access Point (GAP) to the balloons and the balloons to antennas installed on land, while horizontal connectivity links the balloons to one another. This research focuses on the connectivity between the balloons in a mesh network. The proposal focuses on implementing graphical methods such as the convex hull together with ad hoc communication protocols. The proposed protocol includes content-based multicasting using angular sector division rather than grids, along with a dynamic core-based mesh protocol defining certain core active nodes and passive nodes forming the convex hull. The transmission (multicasting and broadcasting) between the nodes will be evaluated using the link probability, which defines the probability that the link between two nodes fails. Based on the link probability and node features, the best path between transmitting and receiving nodes will be evaluated.
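To make the convex hull step concrete, here is a minimal sketch using Andrew's monotone chain algorithm on 2D node positions. It assumes balloons can be projected onto a plane and represented as `(x, y)` tuples; the coordinates and the names `balloons`, `hull`, and `interior` are hypothetical, and the sketch does not decide which nodes act as core or passive nodes in the proposed protocol.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Drop the last vertex while it would create a clockwise or straight turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other one.
    return lower[:-1] + upper[:-1]


# Hypothetical balloon positions projected onto a plane.
balloons = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2), (1, 3)]
hull = convex_hull(balloons)
interior = [p for p in balloons if p not in hull]
```

With these positions, the four corner nodes form the hull and the two remaining nodes lie inside it.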
Climate change, population growth, and water scarcity present unprecedented challenges for agriculture. This project aims to forecast soil moisture using domain knowledge and machine learning for crop management decisions that enable sustainable farming. Traditional methods for predicting hydrological response features require significant computational time and expertise. Recent work has implemented machine learning models as a tool for forecasting hydrological response features, but these models neglect a crucial aspect of traditional hydrological modeling: spatially close units can have vastly different hydrological responses. In traditional hydrological modeling, units with similar hydrological properties are grouped together and share model parameters regardless of their spatial proximity. Inspired by this domain knowledge, we have constructed a novel domain-inspired temporal graph convolution neural network. Our approach involves clustering units based on time-varying hydrological properties, constructing graph topologies for each cluster, and forecasting soil moisture using graph convolutions and a gated recurrent neural network. We have trained, validated, and tested our method on field-scale time series data consisting of approximately 99,000 hydrological response units spanning 40 years in a case study in the northeastern United States. Comparison with existing models illustrates the effectiveness of using domain-inspired clustering with time series graph neural networks. The framework is being deployed as part of a pro bono social impact program. The trained models are being deployed on small-holding farms in central Texas.
ML-based motion planning is a promising approach to produce agents that exhibit complex behaviors and automatically adapt to novel environments. In the context of autonomous driving, it is common to treat all available training data equally. However, this approach produces agents that do not perform robustly in safety-critical settings, an issue that cannot be addressed by simply adding more data to the training set: we show that an agent trained using only a 10% subset of the data performs just as well as an agent trained on the entire dataset. We present a method to predict the inherent difficulty of a driving situation given data collected from a fleet of autonomous vehicles deployed on public roads. We then demonstrate that this difficulty score can be used in a zero-shot transfer to generate curricula for an imitation-learning-based planning agent. Compared to training on the entire unbiased training dataset, we show that prioritizing difficult driving scenarios both reduces collisions by 15% and increases route adherence by 14% in closed-loop evaluation, all while using only 10% of the training data.
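One simple way to picture prioritizing difficult scenarios under a 10% data budget is to keep only the highest-scoring scenarios. The sketch below assumes a precomputed difficulty score per scenario and uses a hard top-fraction cutoff; both the cutoff rule and every name here (`hardest_subset`, `scene_*`, the scores) are hypothetical, and the curricula described in the abstract may be built differently, for example by weighted sampling.

```python
def hardest_subset(scenarios, difficulty, fraction=0.10):
    """Keep the `fraction` of scenarios with the highest difficulty score.

    `difficulty` maps scenario id -> score (higher means harder). A hard
    cutoff is an illustrative assumption, not the paper's curriculum.
    """
    keep = max(1, round(len(scenarios) * fraction))
    ranked = sorted(scenarios, key=lambda s: difficulty[s], reverse=True)
    return ranked[:keep]


# Hypothetical scenario ids with made-up difficulty scores in [0, 1).
scenarios = [f"scene_{i}" for i in range(20)]
difficulty = {s: i / 20 for i, s in enumerate(scenarios)}
curriculum = hardest_subset(scenarios, difficulty)
```

Here 20 scenarios and a 10% budget leave the two hardest scenarios in the training subset.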
We study politeness phenomena in nine typologically diverse languages. Politeness is an important facet of communication and is sometimes argued to be culture-specific, yet existing computational linguistic study is limited to English. We create TyDiP, a dataset containing three-way politeness annotations for 500 examples in each language, totaling 4.5K examples. We evaluate how well multilingual models can identify politeness levels: they show a fairly robust zero-shot transfer ability, yet fall significantly short of estimated human accuracy. We further study mapping the English politeness strategy lexicon into nine languages via automatic translation and lexicon induction, analyzing whether each strategy's impact stays consistent across languages. Lastly, we empirically study the complicated relationship between formality and politeness through transfer experiments. We hope our dataset will support various research questions and applications, from evaluating multilingual models to constructing polite multilingual agents.
We are interested in neurosymbolic systems consisting of a high-level symbolic layer for explainable prediction in terms of human-intelligible concepts, and a low-level neural layer for extracting the symbols required to generate the symbolic explanation. Real data is often imperfect, meaning that even if the symbolic theory remains unchanged, we may still need to address the problem of mapping raw data to high-level symbols each time there is a change in the data acquisition environment or equipment. Manual (re-)annotation of the raw data each time this happens is laborious and expensive, and automated labelling methods are often imperfect, especially for complex problems. NEUROLOG proposed the use of a semantic loss function that allows an existing feature-based symbolic model to guide the extraction of feature-values from raw data, using `abduction'. However, the experiments demonstrating the use of semantic loss through abduction appear to rely heavily on a domain-specific pre-processing step that enables a prior delineation of feature locations in the raw data. We examine the use of semantic loss in domains where such pre-processing is not possible, or is not obvious. We show that without any prior information about the features, the NEUROLOG approach can continue to predict accurately even with substantially incorrect feature predictions. We also show that prior information about the features, even in the form of imperfect pre-training, can help correct this situation. These findings are replicated on the original problem considered by NEUROLOG, without the use of feature-delineation. This suggests that symbolic explanations constructed for data in a domain could be re-used in a related domain, by `feature-adaptation' of pre-trained neural extractors using the semantic loss function constrained by abductive feedback.
Breaking down a document or a conversation into multiple contiguous segments based on its semantic structure is an important and challenging problem in NLP, which can assist many downstream tasks. However, current work on topic segmentation often focuses on the segmentation of structured texts. In this paper, we comprehensively analyze the generalization capabilities of state-of-the-art topic segmentation models on unstructured texts. We find that: (a) current strategies of pre-training on a large corpus of structured text such as Wiki-727K do not help transfer to unstructured texts; and (b) training from scratch with only a relatively small dataset from the target unstructured domain improves the segmentation results by a significant margin.